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BLIND  DECONVOLUTION  TO  IMPROVE  CLASSIFICATION 
OF  TRANSIENT  SOURCE  SIGNALS  IN  MULTIPATH 


1.  INTRODUCTION 

Automated  classification  of  transient  source  signals  has  become  a  necessity  as  Navy  operations  have 
shifted  from  the  open  ocean  to  the  littoral  environment.  The  multitude  of  transient  signals  received  on  a 
passive  sonar  system  from  biologies  and  surface  vessels  can  overwhelm  the  sonar  operator  monitoring  an 
area  for  targets  of  interest.  Automation  offers  a  means  to  identify  and  remove  obvious  false  targets  to  signifi¬ 
cantly  reduce  the  number  of  potential  targets  for  the  sonar  operator  to  classify.  A  difficult  problem  even  in 
simple  cases,  classification  increases  enormously  in  complexity  when  the  received  transient  has  been  dis¬ 
torted  by  multipath,  which  can  be  significant  at  operational  detection  ranges  in  shallow  water. 

The  received  signal  in  the  underwater  multipath  environment  x(t)  is  modeled  by  the  convolution  opera¬ 
tion  between  the  transient  source  signal  s(t)  and  a  Green’s  function  g(t)  that  represents  the  multipath  encoun¬ 
tered  by  the  source  signal  as  it  travels  to  the  receiver.  When  x(t)  =  s(t )  *  git)  and  git)  is  known,  deconvolution 
is  used  to  obtain  5(f).  Removing  the  multipath  distortions  in  the  signal,  which  are  especially  detrimental 
when  multiple  returns  in  the  received  signal  overlap,  allows  more  accurate  classification  of  the  source  sig¬ 
nal.  Given  the  inherent  uncertainties  in  underwater  environmental  data,  it  is  virtually  impossible  to  model  a 
Green’s  function  to  represent  multipath  with  sufficient  accuracy  for  deconvolution  to  produce  the  correct 
source  signal  [1],  However,  the  source  signal  may  be  estimated  using  blind  deconvolution.  Blind  deconvolution 
techniques  do  not  require  known  Green’s  functions  but  instead  use  assumptions  about  the  statistical  proper¬ 
ties  of  the  Green’s  functions  to  obtain  the  source  signal.  One  such  technique  is  the  subject  of  this  report. 

Section  2  of  this  report  contains  a  more  detailed  description  of  the  blind  deconvolution  problem  in 
seismology  and  underwater  acoustics  and  Section  3  gives  one  specific  technique  (the  Cabrelli  algorithm). 
Section  4  shows  the  results  of  some  parameter  studies  using  the  Cabrelli  algorithm  and  discusses  the  results 
of  the  simulation  studies  and  future  directions  for  research.  Section  5  gives  the  conclusion. 

2.  BLIND  DECONVOLUTION  TECHNIQUES 

Blind  deconvolution  is  commonly  used  to  mitigate  multipath  distortion  that  occurs  over  communication 
channels.  It  is  also  used  to  reduce  blurring  in  image  restoration.  More  closely  related  to  the  blind  deconvolution 
problem  in  underwater  acoustics  is  the  seismic  application,  which  attempts  to  retrieve  the  reflection  coeffi¬ 
cients  for  layers  within  the  Earth  without  exact  knowledge  of  the  input  waveform.  The  similarity  is  that  in 
both  the  underwater  acoustics  and  seismic  applications,  the  multipath  function  (Green’s  function  for  under¬ 
water  acoustics,  reflectivity  series  for  seismology)  is  well  represented  by  a  series  of  spikes  interspersed  with 
small,  near-zero  values.  Each  application  assumes  that  the  multipath  function  is  “sparse.” 

One  thoroughly  investigated  blind  deconvolution  technique  proposed  by  the  seismic  community  is  the 
Wiggins’  minimum  entropy  deconvolution  technique  [2],  This  was  the  initial  impetus  for  investigating  the 
multipath  problem  in  transient  classification  [3-5].  The  Wiggins’  method  is  now  considered  one  of  a  series 
of  cumulant  maximization  techniques  [6,  7],  These  techniques  assume  that  the  convolution  of  a  multipath 
function  with  a  source  signal  produces  a  received  signal  that  is  less  sparse  than  the  multipath  function  since 
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convolution  is  a  smoothing  process.  The  methods  attempt  to  derive  filters  that  drive  the  received  signal  to 
higher  sparseness  to  obtain  multipath  functions  that  may  then  be  used  to  implement  deconvolution  to  obtain 
the  source  signal.  Wiggins’  measure  of  sparseness  is  the  V-norm, 


V(y)  =  J>y7  I 


\2 


,yj 


\  j 


which  is  essentially  the  same  as  kurtosis.  Wiggins’  technique  can  be  recast  using  non-Gaussianity  as  the 
statistical  feature  of  the  multipath  function  that  is  reduced  (i.e.,  made  more  Gaussian)  by  the  process  of 
convolution.  Maximization  of  the  V-norm  leads  to  an  iterative  technique  that  has  been  shown  to  be  quite 
sensitive  to  noise  [8]. 

Later,  Cabrelli  [9]  introduced  a  technique  analogous  to  that  of  Wiggins  in  many  respects.  Most  notably, 
Cabrelli’s  method  uses  the  D-norm, 


D(  y)  =  max 

1  <j<m ' 
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as  a  measure  of  sparseness  and,  like  the  V-norm  of  the  Wiggins’  algorithm,  the  D-norm  tracks  closely  with 
kurtosis.  However,  comparative  studies  show  that  the  Cabrelli  algorithm  is  less  sensitive  to  noise  than  the 
Wiggins  algorithm  [5]  and  is  better  suited  to  the  underwater  acoustics  application  of  blind  deconvolution. 
Thus,  the  following  presentation  is  limited  to  the  Cabrelli  method. 


3.  CABRELLPS  ALGORITHM 

Cabrelli  starts  with  a  simple  but  reasonable  signal  model.  The  time  series  arriving  at  the y'th  hydrophone 
of  an  array  is 


Xj  =  s*gj+nj,  (1) 

where  5  is  the  signal  leaving  the  target  of  interest,  g  is  the  Green’s  function  describing  the  propagation  from 
the  target  to  the  jih  hydrophone,  and  nj  is  the  noise  on  the y'th  hydrophone. 

A  filter/ of  length  /  is  defined  that  acts  on  all  of  the  data  channels  to  produce  filtered  output, 

yj=f*Xj=f*s*gj+f*  nj  .  (2) 

It  is  clear  from  Eq.  (2)  that  a  desirable  filter  would  render  the  first  term  equal  to  the  Green’s  function 
while  correlating  poorly  with  the  noise  term.  Such  a  filter  would  be 

f=s\  (3) 

since  the  first  term  would  be  rendered  equal  to  the  Green’s  function  while  ,s,  and  hence  its  inverse,  is  totally 
independent  of  the  noise.  If  such  a  filter  could  be  found,  then  the  desired  signal  could  be  extracted  as  its 
inverse,  and  the  Green’s  function  estimate  for  each  data  channel  could  be  found  simply  as  the  filtered  data 

>)  =  f*  Xj  =  .s'" 1  *  v  *  g.  =  gj . 


(4) 


Blind  Deconvolution  to  Improve  Classification  of  Transient  Source  Signals 


3 


Cabrelli  attempts  to  find  such  a  filter  by  assuming  that  the  Green’s  functions  are  considerably  sparser 
than  the  original  data.  The  main  assumption  is  that  increasing  the  non-Gaussianity  of  the  data  will  drive  it 
toward  the  Green’s  function.  Thus,  by  driving  the  sparseness  of  the  original  data  up  (by  way  of  the  filter/), 
Eq.  (4),  and  hence,  Eq.  (3)  could  be  satisfied.  As  a  measure  of  sparseness,  he  used  the  simplicity  or  D-norm 
defined  by 


(5) 


where 


Y 


(6) 


is  the  Euclidean  norm  of  the  matrix  Y  =  [y  and  yjk  is  the  klh  bin  of  the  filtered  data  from  the y'th  hydro¬ 
phone.  Cabrelli  obtained  this  norm  via  geometric  arguments,  and  showed  the  relationship  of  this  norm  with 
the  varimax  (kurtosis)  norm  used  by  Wiggins.  Cabrelli  argues  that  optimization  of  the  D-norm  will  produce 
an  estimate  of  a  filter  that  satisfies  Eqs.  (3)  and  (4). 

To  optimize  the  D-norm,  derivatives  of  y  and  II  T||  will  be  required.  These  derivatives  are  easily  calcu¬ 
lated: 


dYj,k  d 


3/. 


-  I  fnXj,k-n  I  Xj,k-n  _  ^j^Xj,k-n^nm  Xj, 
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and 
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It  will  be  convenient  later  to  express  this  latter  derivative  in  terms  of  the  data  autocorrelations, 


as 


*«-»!  ^ 'jXj,k-nXj,k-m  ’ 


a||y| 

3/. 


Y/r 

J  n  n  -n 
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To  optimize  the  D-norm,  the  normal  equations  for  the  filter  coefficients  are  obtained  by  setting  the 
derivatives  with  respect  to  the  filter  coefficients  equal  to  zero.  This  gives 


fy.*  a  |  y  I 

df  }j’k  df 

J  m  J 


=  o , 


(ii) 


which  can  be  written  as 


x  ,  -  y  , 

j.k-m  -  j,k 


!Y/r  =o, 

Jn  n  -  m 


(12) 


where  Eqs.  (7)  and  (10)  have  been  used  and  where  j  runs  over  hydrophones  while  k  runs  over  bins. 
By  constructing  an  autocorrelation  matrix  R  whose  elements  are  given  by 


R  —  F 

nm  1 n-w  ' 


the  normal  equations  can  be  written  in  matrix  form  as 


3U  Y\-  R-f  =  xjk. 


(13) 


(14) 


/is  a  column  vector  composed  from  the  filter  coefficients,  and  x’k  is  the  transpose  of  [x} k,  xJk , , ,  x .w],  with 
x  =  0  if  n  does  not  correspond  to  a  bin  number. 


j" 


Notice  that  the  normal  equations  are  invariant  under  the  scale  transformation/— >  A/,  where  A  is  a  scalar 
multiple.  Since  y  k  II  Y 1 1  ~2is  a  scalar,  then  the  simpler  equation 


Jk 


R  f=  x 

has  the  same  solution  set  as  the  original  normal  equations. 

Assuming  that  R  is  well  conditioned  for  inversion, 

fjk  =  R~'  -xjk 

is  evaluated  for  the  set  of  filters  (one  for  each  j,k  pair),  each  filter  is  applied  to  the  original  data  to  get 


(15) 


(16) 


-nut 


-If! 


(17) 


and  the  matrix  Yjk  =  [y/  ]  is  formed  for  each  jk  pair. 
Equation  (5),  in  the  form 


(18) 
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can  now  be  evaluated  to  select  the  optimum  filter  from  the  set  {fjk}.  The  optimum  filter  is  then  inverted  to 
give  the  signal  estimate.  If  the  Green’s  functions  are  desired,  they  are  given  by  they'*  used  to  compose  Yjk  for 
the  optimal  jk  pair  that  has  already  been  calculated  to  evaluate  the  D-norm  in  Eq.  (18). 

Cabrelli  denotes  the  process  just  described  as 


sup. 


{*>} 


sup. 


max;jt 


y  j.t 


>  =  ma  xjki 


sup. 


y  j,k 


(19) 


The  first  statement  in  Eq.  (19)  says  to  choose  the  filter  that  optimizes  the  D-norm — an  obvious  state¬ 
ment  of  the  problem.  However,  the  second  statement  says  that  the  operations  are  interchangeable  and  that  it 
is  much  simpler  to  find  a  set  of  optimal  filters  (one  for  each  jk  pair)  and  then  pick  the  one  that  maximizes  the 
D-norm.  This  process  is  described  by  Eqs.  (16)  through  (18). 

4.  SIMULATIONS 

This  report  gives  three  basic  simulation  analyses  of  the  Cabrelli  algorithm: 

(1)  Cabrelli  performance  vs  signal  complexity, 

(2)  Cabrelli  performance  vs  Green’s  function  complexity,  and 

(3)  a  classification  operating  characteristics  (CLOC)  analysis  of  Cabrelli  performance  vs  signal- 
to-noise  ratio  (SNR). 

In  all  three  studies  the  received  data  were  simulated  according  to  Eq.  (1)  by  convolving  artificially  con¬ 
structed  transient  signals  with  artificially  constructed  Green’s  functions.  Noise  was  added  only  in  the  third 
study.  The  constructed  transients  and  Green’s  functions  were  different  for  each  of  the  three  studies  and  are 
described  separately  for  each  study. 

Performance  of  the  Cabrelli  algorithm  was  evaluated  by  calculating  a  normalized  correlation  between 
the  transient  signal  that  Cabrelli’s  algorithm  estimates  and  the  actual  transient  used  to  simulate  the  data.  The 
normalization  was  performed  according  to 


y(a,b) 


max(a  ®  b) 


j 


max  (a  ®  a) 


V 


max  ( b  ®  b) 


(20) 


(where  ®  represents  correlation)  for  the  normalized  correlation  between  two  signals  a  and  b.  Although  most 
classifiers  are  generally  much  more  sophisticated  and  involve  many  signal  features,  this  performance  mea¬ 
sure  is  adequate  for  these  studies  that  compare  only  the  quality  of  the  source  estimate  with  no  processing 
(the  received  signal)  with  the  source  estimate  obtained  from  the  blind  deconvolution  algorithm. 

It  is  assumed  that  the  passband  of  the  received  signal  has  been  estimated  prior  to  this  stage  of  process¬ 
ing.  With  this  assumption,  the  received  signal  was  bandpass  filtered  before  it  was  sent  to  the  Cabrelli  algo¬ 
rithm,  and  the  signal  estimate  produced  by  the  Cabrelli  algorithm  was  again  bandpass  filtered  before  it  was 
sent  to  the  classifier  (i.e.,  before  the  normalized  correlation  was  calculated). 
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4.1  Signal  Complexity 

The  waveform  type  that  Cabrelli  originally  developed  his  algorithm  for  was  the  simple  wavelet 
signal  commonly  used  in  seismic  petroleum  exploration.  This  naturally  raises  the  concern  that  this  algo¬ 
rithm  might  not  perform  adequately  when  it  is  used  with  the  more  complicated  transient  signals  commonly 
encountered  in  underwater  acoustics.  To  study  the  effect  of  signal  complexity  on  Cabrelli  performance, 
complexity  must  first  be  defined. 

Cabrelli’s  algorithm  does  not  perform  well  for  hyperbolic  frequency-modulated  (HFMs)  or  linear 
frequency-modulated  (LFMs)  signals.  Since  it  does  work  well  on  continuous  wave  (CW)  signals,  the  ques¬ 
tion  arises:  How  does  its  performance  behave  as  a  function  of  signal  bandwidth?  Thus,  the  bandwidth  of  an 
LFM  with  a  fixed  center  frequency  was  used  as  the  measure  of  complexity. 

A  series  of  100-ms  LFMs  were  produced,  each  one  centered  on  100  Hz  with  bandwidths  increasing  from 
0.001  to  180  Hz.  These  trial  LFMs  were  convolved  with  a  set  of  synthetic  Green’s  functions  from  the  set 
described  in  the  next  section  to  make  a  set  of  received  signals  at  the  three  hydrophones  of  an  array.  The 
Cabrelli  algorithm  was  applied  to  each  member  of  this  set  for  all  reasonable  filter  lengths,  and  the  correla¬ 
tions  of  the  outputs  with  their  corresponding  transients  were  calculated  using  Eq.  (20).  Figure  1  shows  the 
maximum  correlation  (over  filter  length)  plotted  against  bandwidth.  The  trend  to  poor  performance  at  high 
bandwidth  for  LFM  signals  is  clearly  demonstrated  by  this  figure.  The  same  trend  is  observed  in  Fig.  2  for 
the  correlations  averaged  over  Green’s  functions. 

Why  does  this  degradation  occur?  One  possibility  is  that  the  simplicity  as  measured  by  the  D-norm 
might  increase  as  bandwidth  increases.  If  the  D-norm  of  the  simulated  transient  (without  multipath)  is  com- 


Fig.  1  —  Normalized  correlations  of  the  transient  signals  extracted  by  Cabrelli’s  algorithm  with  the 
simulated  transient  signal  used  to  construct  the  data  as  a  function  of  signal  complexity  as  measured 
by  the  bandwidth  of  the  LFM  used  in  this  study 
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Fig.  2  —  Normalized  correlations  of  the  Green’s  functions  extracted  by  Cabrelli’s  algorithm  for 
hydrophone  3  with  the  simulated  Green’s  functions  used  to  construct  the  data  as  a  function  of  the 
bandwidth  of  the  LFM  used  in  this  study 


parable  to  the  D-norm  of  the  Green’s  function,  the  performance  of  Cabrelli’s  algorithm  might  be  compro¬ 
mised.  Figure  3  shows  the  D-norm  of  the  simulated  transient  prior  to  convolution  with  the  Green’s  function 
plotted  against  bandwidth.  The  simplicity  increases  with  increasing  bandwidth  from  about  0.135  to  about 
0.1455,  but  this  change  in  the  D-norm  across  the  bandwidth  explored  is  relatively  small.  This  change  was 
only  about  0.0105,  approximately  3%  of  the  variation  in  D-norm  usually  encountered  in  the  loop  over  jk 
pairs  performed  in  the  Cabrelli  algorithm.  The  D-norms  for  the  Green’s  functions  were  0.55, 0.25  and  0.28. 
At  the  low  bandwidth  end,  the  ratios  of  transient  D-norms  to  Green’s  functions  D-norms  for  the  three  hydro¬ 
phones  were  24%,  54%,  and  49%.  At  the  high  end,  they  were  26%,  58%,  and  52%.  It  would  be  hard  to 
imagine  that  this  slight  increase  in  the  percentages  would  produce  such  a  noticeable  change  in  performance. 

4.2  Green’s  Function  Complexity 

The  type  of  Green’s  function  for  which  Cabrelli  originally  developed  his  algorithm  was  the  extremely 
sparse  “reflection  time  series”  commonly  found  in  seismic  petroleum  exploration.  These  Green’s  functions 
are  zero  almost  everywhere,  with  only  a  few  nonzero  spikes  which  are  usually  well  spaced.  This  fact  natu¬ 
rally  raises  the  concern  that  Cabrelli’s  algorithm  might  not  perform  adequately  (or  even  at  all)  when  it  is 
used  with  the  vastly  more  complicated  Green’s  functions  commonly  encountered  in  shallow-water  acous¬ 
tics.  Two  Green’s  function  complexity  studies  were  done.  The  first  used  artificial  Green’s  functions  of  in¬ 
creasing  complexity,  while  the  second  used  Green’s  functions  constructed  with  the  SACLANTCEN 
Normal-Mode  Acoustic  Propagation  Model  (SNAP)  using  the  environmental  parameters  from  a  range  inde¬ 
pendent  run  from  an  acoustically  shallow  site  in  the  Atlantic  Ocean  near  the  Blake  Plateau  [10].  This  latter 
set  of  Green’s  functions  was  made  more  complex  in  a  realistic  way  by  increasing  the  range  from  the  target  to 
the  receiver;  range  was  the  complexity  measure. 
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Fig.  3  —  Values  of  ihc  D-norm  for  the  transient  signals  used  to  construct  the  data 
as  a  function  of  the  bandwidth  of  the  signals 


The  artificial  Green’s  functions  consisted  of  3  to  25  alternate-sign  spikes  modified  with  an  exponential 
decay  to  simulate  decreasing  size  of  returns  with  increasing  range  imposed  on  a  sequence  of  1000  zeros.  The 
spacing  between  spikes  was  prescribed  at  10,  20,  30, 40.  and  50  bins.  The  signal  used  was  a  300-pt  metallic 
bang  (i.e.,  an  exponentially  damped  sinusoid  centered  on  a  frequency  of  20  Hz).  A  single  hydrophone  was 
simulated. 

Figure  4(a)  shows  a  plot  of  the  normalized  correlations  between  the  source  estimate  and  the  true  source 
vs  the  number  of  spikes  in  the  Green’s  function  for  the  case  of  no  deconvolution.  The  multiple  traces  on  this 
plot  correspond  to  different  spike  spacings.  Figure  4(b)  shows  the  same  calculations  for  source  estimates 
obtained  from  the  blind  deconvolution  technique.  Figure  4(a)  shows  what  might  be  expected-performance 
degrades  as  the  complexity  of  the  Green’s  functions  increases.  Except  for  an  initial  result  with  the  10-point 
spacing  and  a  3-spike  Green’s  function,  the  correlation  coefficients  are  less  than  0.9,  and  for  simulated 
Green’s  functions  with  9  or  more  spikes,  the  correlation  coefficients  are  less  than  0.7.  Figure  4(b)  shows 
performance  oscillating  as  the  complexity  of  the  Green's  functions  increases.  The  oscillation  is  probably  an 
artifact  caused  by  the  regular  spacing  of  the  spikes  in  the  artificial  Green’s  functions,  but  the  lack  of  a 
degrading  trend  as  the  number  of  spikes  is  increased  (except  for  the  10-bin  spacing)  is  encouraging.  The 
exceptional  performance  that  occurs  with  a  spacing  of  30  bins  is  an  artifact  of  the  processing  and  should  not 
be  regarded  as  expected  behavior.  The  original  damped  sinusoid  has  a  15-bin  periodicity,  and  the  30-bin 
spacing  between  Green’s  function  spikes  results  in  constructive  interaction  in  the  convolution  process.  The 
Green’s  function  spikes  line  up  with  the  sinusoid  peaks,  leading  to  high  correlations  in  this  example.  With  a 
phase  shift  in  the  damped  sinusoid,  the  Green’s  function  spikes  could  line  up  small  values  within  the  sinu¬ 
soid  and  result  in  correlations  that  are  stable  with  increasing  spike  number  but  significantly  lower  than  that 
shown. 
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(b)  Cabrelli  preprocessing 

Fig.  4  —  Maximum  values  (over  filter  length)  of  the  normalized  correlations  as  functions  of  Green’s 
function  complexity  as  measured  by  the  number  of  spikes  in  the  Green’s  functions 


The  second  Green’s  function  complexity  study  uses  the  same  metallic  bang  but  more  realistic  Green’s 
functions  to  form  the  multipath  corrupted  received  signals  that  would  be  used  for  classification.  The  Green’s 
functions  were  obtained  using  the  SNAP  normal  mode  model  at  differing  ranges  to  produce  a  set  of  received 
signals  on  an  array  of  16  hydrophones  for  each  range.  The  unprocessed  source  signal  estimates  were  pro¬ 
cessed  using  Cabrelli’s  algorithm,  and  the  output  was  compared  to  the  initial  transient  signal  using  Eq.  (20) 
for  a  range  of  filter  lengths.  Figure  5  shows  the  result. 

This  figure  plots  correlations  vs  filter  length  and  range.  At  each  range,  filter  length  can  be  scanned  to  see 
if  there  are  any  filter  lengths  that  produce  correlations  greater  than  the  unprocessed  correlation  (which  is 
plotted  vs  range  in  the  left-most  column  of  the  figure).  As  the  ranges  of  Fig.  5  are  examined,  it  becomes 
apparent  that  at  the  smallest  ranges  (600  and  2400  m)  Cabrelli  output  is  no  better  than  the  unprocessed 
signal.  This  is  not  surprising  since  at  these  ranges  there  is  very  little  multipath  corruption  of  the  signal. 

At  4300  m,  the  unprocessed  signal  correlates  poorly  with  the  initial  transient,  but  there  is  a  range  of 
filter  lengths  around  40  m  for  which  the  Cabrelli  estimated  source  signal  is  significantly  better  than  the 
unprocessed  signal.  (The  small  filter  lengths  are  ignored  because  the  signal  extracted  from  such  a  short  filter 
would  not  have  enough  points  to  give  an  adequate  representation  of  the  initial  transient.)  As  the  rest  of  the 
ranges  are  scanned,  it  becomes  clear  that  at  any  range  there  are  always  sufficiently  large  filter  lengths  for 
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Fig.  5  —  Normalized  correlations  of  the  signals  extracted  from  Cabrclli  preprocessing  with  the  transient  signals 
used  to  construct  the  data  as  functions  of  filter  length  and  target  range  that  were  used  as  a  measure  of  Green’s 
function  complexity 


which  application  of  the  Cabrelli  algorithm  would  be  better  than  no  preprocessing.  The  conclusion  that 
Cabrelli’s  algorithm  can  perform  adequately  with  Green’s  functions  of  the  complexity  found  in  shallow 
water  at  tactical  ranges  is  justified. 

4.3  Classification  Performance  Analysis 

In  this  study,  the  Cabrelli  algorithm  was  run  using  five  different  signals  that  represent  five  different 
classes  of  transients.  Noise  levels  at  five  SNRs  were  tested,  with  SNR  defined  as  the  ratio  of  signal  and 
noise  standard  deviations  and  converted  to  decibels  (dB)  for  display.  The  signal  types  are  labeled  Bang. 
Pulse,  LFM,  Sumsin  and  PRN. 

The  bang  is  an  exponentially  damped  sinusoid  centered  on  60  Hz.  The  pulse  is  constructed  to  have  a  flat 
(real)  spectrum  from  25  to  75  Hz.  The  LFM  is  a  333-ms  sweep  from  30  to  90  Hz.  Sumsin  is  a  sum  of  two 
sine  waves  at  30  and  33  Hz,  with  the  second  one  phase-shifted  to  make  the  transient  taper  at  its  extremities. 
The  PRN  contains  uniformly  distributed,  zero-mean  white  noise  bandpass  filtered  to  the  band  from  20  to 
70  Hz.  All  of  these  signals  were  zero  padded  with  as  many  zeros  preceding  them  as  there  were  bins  in  the 
nonzero  part  of  the  signal  and  with  the  same  number  of  zeros  following  them. 

A  single  set  of  SNAP-generated  Green’s  functions  from  Section  4.2  (one  for  the  propagation  to  each 
hydrophone)  was  used  to  create  received  signals  at  a  range  of  1 1400-m  for  use  in  this  study.  The  1 1400-m 
range  was  chosen  because  it  presents  significant  multipath  corruption  and  corresponds  to  good  results  for 
the  performance  of  the  Cabrelli  algorithm.  As  an  example  of  the  degree  of  multipath  corruption  in  each  of 
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the  five  signals,  Fig.  6  shows  the  five  simulated  transients  next  to  one  channel  of  received  data  for  the 
11400-m  range. 

Twenty  realizations  of  the  noise  were  mixed  with  each  of  the  five  received  signals  at  a  fixed  SNR.  These 
100  prepared  signals  were  each  run  through  the  Cabrelli  algorithm  for  a  set  of  reasonable  filter  lengths  (the 
l  in  the  definition  of  the  xjk  used  in  Eqs.  (14)  and  (15)),  and  the  optimal  filter  length  was  chosen  for  each 
signal,  based  on  correlations  with  the  true  source.  In  practice,  the  optimum  filter  length  is  unknown  (see 
Section  5,  Discussion),  but  it  is  assumed  to  be  known  in  this  study  to  illustrate  the  best  possible  performance 
for  the  algorithm. 

The  100  resulting  output  signal  estimates  were  compared  with  the  five  trial  transients  using  the  correla¬ 
tion  (Eq.  (20)).  The  500  resulting  correlations  were  used  to  populate  a  confusion  matrix.  A  confusion  matrix 
has  the  signal  type  for  the  unidentified  source  labeling  the  rows  and  classification  choices  labeling  the 
columns. 

Figure  7  shows  the  data  used  to  construct  a  confusion  matrix  for  this  study,  corresponding  to  an  SNR 
of  15  dB.  The  plot  in  each  block  of  this  5x5  matrix  shows  the  correlations  for  that  box  plotted  in  increasing 
order.  For  each  threshold  value,  the  counts  from  each  block  can  be  easily  ascertained  visually.  These  counts 
are  needed  to  calculate  probabilities.  The  diagonal  blocks  represent  correct  classifications  (or  false 
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Fig.  6  —  Five  classes  of  transient  signals  (a)  through  (e),  and  the  received  transients  at 
the  11400-m  range  (receiver  depth  of  96  m)  (0  through  (j) 
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dismissals),  while  off-diagonal  blocks  represent  false  classifications  (or  correct  dismissals).  A  confusion 
matrix  is  judged  “good”  if  correlation  values  in  its  diagonal  blocks  are  large  and  the  correlation  values  in  its 
off-diagonal  blocks  are  small. 

Four  probabilities  can  be  calculated  from  Fig.  7  by  counting  the  numbers  of  correlations  above  and 
below  a  threshold  value  and  normalizing  by  the  number  of  possible  outcomes. 

These  four  probabilities  for  a  given  prepared  signal  type  are: 

P  =  (probability  of  correct  classification) — number  of  correlations  above  threshold  in  diagonal 
block/number  of  realizations. 

P  =  (probability  of  false  classification) — number  of  correlations  above  threshold  in  rest  of  row/ 
(number  of  signal  types- 1  )(number  of  realizations). 

P  =  (probability  of  correct  dismissal) — number  of  correlations  below  threshold  in  rest  of  row/ 
(number  of  signal  types- l)(number  of  realizations),  and 

P  =  (probability  of  false  dismissal) — number  of  correlations  below  threshold  in  diagonal  block/ 
number  of  realizations. 


matrix  shows  the  correlations  over  20  realizations.  Counts  of  correlations  above  and  below  a  given  threshold 
can  be  read  visually  from  this  figure  for  the  construction  of  Cl.OC  curves. 
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If  a  particular  correlation  value  is  greater  than  the  threshold  value,  it  is  counted  as  a  classification.  If  it  is 
less  than  the  threshold  value,  it  is  counted  as  a  dismissal. 

Each  probability  could  be  plotted  against  the  threshold  value  or,  as  is  more  common,  parametric  plots  of 
one  probability  vs  another  could  be  made.  Note  that  the  scales  on  the  plots  of  correlation  values  in  Fig.  7  are 
not  consistent  across  subplots  but  are  scaled  to  be  relevant  for  each  particular  case.  In  many  cases,  the 
correlation  values  are  below  0.8,  which  in  general  represents  poor  correlation  for  these  signals. 

Tables  1  through  10  give  the  confusion  matrices  resulting  from  use  of  the  maximum  correlation  as  the 
classification  criterion  for  the  five  different  SNRs.  Each  element  contains  the  number  of  realizations  (out  of 
20)  for  which  the  signal  heading  each  column  is  chosen  by  the  classifier.  When  all  five  signal  classes  are 
used  for  classification,  as  done  in  Tables  1  through  5,  the  pulse  and  Sumsin  signals  are  properly  classified 
with  high  probability  (implies  high  Pcc )  at  all  SNRs.  In  no  instance  is  the  signal  improperly  dismissed  (im¬ 
plies  low  Pfd).  However,  the  bang,  LFM,  and  PRN  are  almost  as  likely  to  be  falsely  classified  (implies  high 
P.)  as  the  pulse  or  Sumsin,  as  these  two  signals  are  to  be  correctly  classified.  Removing  the  pulse  and 
Sumsin  signals  from  the  classification  classes  results  in  high  P  c  and  low  P  for  the  bang,  but  the  LFM  is 
falsely  classified  as  the  bang  well  over  50%  of  the  time;  it  is  falsely  classified  as  the  PRN  the  remainder  of 
the  time.  The  PRN  is  somewhat  better  with  higher  correct  classifications  at  the  high  SNRs. 


Table  1  —  Five-Class  Confusion  Matrix  for  SNR  =  5  dB 


Bang 

Pulse 

LFM 

Sumsin 

PRN 

Bang 

0 

19 

0 

1 

0 

Pulse 

0 

20 

0 

0 

LFM 

2 

17 

0 

1 

0 

Sumsin 

0 

0 

0 

20 

0 

PRN 

0 

20 

0 

0 

0 

Table  2  —  Five-Class  Confusion  Matrix  for  SNR  =  10  dB 


Bang 

Pulse 

LFM 

Sumsin 

PRN 

Bang 

0 

20 

0 

0 

0 

Pulse 

0 

20 

0 

0 

0 

LFM 

2 

17 

0 

1 

0 

Sumsin 

0 

0 

0 

20 

0 

PRN 

0 

19 

0 

1 

0 

Table  3  —  Five-Class  Confusion  Matrix  for  SNR  =  15  dB 


Bang 

Pulse 

LFM 

Sumsin 

PRN 

Bang 

3 

17 

0 

0 

0 

Pulse 

0 

20 

0 

0 

0 

LFM 

1 

18 

0 

1 

0 

Sumsin 

0 

0 

0 

20 

0 

PRN 

0 

20 

0 

0 

0 
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Table  4  —  Five-Class  Confusion  Matrix  for  SNR  =  20  dB 


Bang 

Pulse 

LFM 

Sumsin 

PRN 

Bang 

0 

20 

0 

0 

0 

Pulse 

0 

20 

0 

0 

0 

LFM 

0 

20 

0 

0 

0 

Sumsin 

0 

0 

0 

20 

0 

PRN 

0 

20 

0 

0 

0 

Table  5  —  Five-Class  Confusion  Matrix  for  SNR  =  25  dB 


Bang 

Pulse 

LFM 

Sumsin 

PRN 

Bang 

1 

19 

0 

0 

o 

Pulse 

0 

20 

0 

0 

0 

LFM 

0 

18 

0 

2 

0 

Sumsin 

0 

0 

0 

20 

0 

PRN 

0 

20 

0 

0 

o 

Finally,  Classification  Operating  Characteristics  (CLOC)  curves  for  the  five  SNRs  are  shown  for  the 
five  signals  in  Fig.  8.  These  curves  include  only  two  of  the  probabilities  (i.e.,  Pcc  and  P  )  and  are  created  by 
varying  a  threshold  correlation  value  and  calculating  the  probabilities  as  defined  above.  The  pulse  and 
Sumsin  signals  in  (b)  and  (d)  are  classified  correctly  with  Pcc  reaching  1 .0,  while  Pfc  is  still  zero.  The  bang  in 
(a)  reaches  a  Pfc  of  between  0.3  and  0.4  before  Pcc  reaches  1.0.  However,  the  remaining  two  signals  (c)  and 
(e)  reach  Pfc  values  of  about  0.7  to  1 .0  and  0.35  to  0.45  before  Pcc  reaches  1 .0. 

5.  DISCUSSION 

The  results  of  the  previous  sections  show  that  Cabrelli’s  algorithm  performs  well  with  complicated 
propagation  Green’s  functions  of  the  type  common  to  tactical  ranges  in  a  shallow-water  environment  over  a 
range  of  reasonable  SNRs.  However,  it  is  quite  sensitive  to  signal  type  and  signal  complexity  that  increases 
with  frequency.  While  this  study  shows  that  there  are  two  signal  types  for  which  the  algorithm  improves 
correct  classification  with  high  probabilities  (pulse  and  Sumsin),  the  degree  to  which  this  will  improve  a 
classifier  for  the  pulse  is  limited  by  high  probabilities  for  false  classification  of  the  remaining  signals.  The 
algorithm  generates  source  signal  estimates  for  the  received  bang.  LFM,  and  PRN  signals  that  are  just  as 
likely  to  be  classified  as  a  pulse  as  is  the  received  pulse.  The  pulse  appears  to  be  the  “default”  estimate. 

Using  only  the  bang,  LFM,  and  PRN  signal  classes,  the  bang  was  always  correctly  classified  as  a  bang 
(see  Tables  6  through  10).  Unfortunately,  the  PRN  was  also  classified  as  a  bang  in  a  significant  number  of 
the  cases  for  the  five  SNRs  (17,  13,  5,  12,  and  9  times  out  of  20.  respectively).  However,  the  PRN  was 
correctly  classified  in  the  remaining  cases.  Even  with  only  three  choices,  the  LFM  was  never  classified  as  an 
LFM  but  as  a  bang  and  occasionally  as  a  PRN. 

The  results  given  in  this  report  are  dependent  on  the  correlation  as  the  classifier.  Therefore,  they  are  not 
sufficient  to  draw  complete  conclusions  about  the  improvements  that  the  Cabrelli  algorithm  may  have  on 
classification.  For  example,  when  the  bang  is  falsely  classified  as  the  pulse,  the  source  estimate  generated 
by  the  Cabrelli  algorithm  is  visually  something  in  between  the  two  signal  types,  as  illustrated  in  Fig.  9. 
Here,  the  source  estimate  derived  from  the  received  bang  in  the  1 1400-m  case  from  the  earlier  section  is 


Prob  Correct  Class 
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(a)  Bang  (b)  Pulse 


(c)  LFM  (d)  Sumsin  (e)  PRN 

Fig.  8  —  CLOC  curves  (the  five  curves  in  each  subplot  represent  SNR  values  of  5, 10, 15, 20,  and  25  dB) 


Table  6  —  Three-Class  Confusion  Matrix  for  SNR  =  5  dB 


Bang 

LFM 

PRN 

Bang 

20 

0 

0 

LFM 

16 

0 

4 

PRN 

17 

0 

3 

Table  7  —  Three-Class  Confusion  Matrix  for  SNR  =  10  dB 


Bang 

LFM 

PRN 

Bang 

20 

0 

0 

LFM 

16 

0 

4 

PRN 

13 

0 

7 

Table  8  —  Three-Class  Confusion  Matrix  for  SNR  =  15  dB 


Bang 

LFM 

PRN 

Bang 

20 

0 

0 

LFM 

12 

0 

8 

PRN 

5 

0 

15 

16 


Pflug  et  at. 


Table  9  —  Three-Class  Confusion  Matrix  for  SNR  =  20  dB 


Bang 

LFM 

PRN 

Bang 

20 

0 

0 

LFM 

12 

0 

8 

PRN 

12 

0 

8 

Table  10  —  Three-Class  Confusion  Matrix  for  SNR  =  25  dB 


Bang 

LFM 

PRN 

Bang 

20 

0 

0 

LFM 

16 

0 

4 

PRN 

9 

0 

11 

more  correlated  with  the  pulse  signal  rather  than  the  bang,  but  other  classification  features  might  easily 
recognize  it  as  the  bang. 

So  far,  the  issue  of  filter  length  has  been  ignored.  There  are  different  approaches  to  dealing  with  this 
unknown  parameter.  The  first  approach  is  to  directly  pass  the  source  estimates  generated  for  each  filter 
length  to  the  classifier.  An  alternate  approach  is  to  cull  the  source  estimate  set  to  a  significantly  smaller  set 
before  passing  to  the  classifier.  For  example,  if  the  hydrophones  are  separated  such  that  the  Green’s  func- 


Fig,  9  —  Source  estimate  generated  by  the  deconvolution  algorithm  for 
the  received  bang  signal  compared  to  the  bang  and  the  pulse 
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tions  are  significantly  different  from  channel  to  channel,  then  the  average  autocorrelation  of  the  received 
signals  will  be  similar  to  the  autocorrelation  of  the  source  signal.  This  average  autocorrelation  can  be  used  to 
reduce  the  size  of  the  solution  set  by  comparing  the  autocorrelations  of  the  extracted  signals  with  the 
autocorrelations  of  the  classification  signal  types.  Also,  exploratory  studies  have  shown  that  for  some  signal 
types,  the  best  source  esimate  often  appears  at  multiple  filter  lengths,  while  the  other  estimates  are  dissimi¬ 
lar.  This  redundancy  can  be  exploited  to  cull  the  solution  set.  However,  any  sort  of  culling  before  classifica¬ 
tion  introduces  more  uncertainty  into  the  problem.  As  long  as  the  classifier  operates  quickly,  it  is  the  most 
reliable  way  to  cull  the  solution  set.  Thus,  at  the  expense  of  higher  P  ,  it  is  perhaps  better  to  pass  the  entire 
source  estimate  set  directly  to  the  classifier. 

Clearly,  the  Cabrelli  algorithm  alone  will  not  be  adequate  as  a  general  multipath  compensation  prepro¬ 
cessor  in  shallow-water  transient  classification.  It  will  have  to  be  aided  in  some  way,  and  this  research  now 
takes  this  direction.  Other  algorithms  under  investigation  use  information  that  is  independent  of  the  infor¬ 
mation  that  Cabrelli’s  algorithm  uses.  For  example,  Rietsch’s  algorithm  [11]  uses  the  fact  that  the  signal  is 
common  to  all  hydrophones  of  an  array,  while  the  Green’s  functions  are  different  from  element  to  element. 
This  algorithm  can  handle  any  signal  (no  matter  how  complicated)  and  any  Green’s  function,  but  it  is  too 
sensitive  to  noise  in  its  present  incarnation.  Ways  are  being  explored  (with  some  success)  to  stabilize  it 
against  noise.  An  algorithm  derived  from  a  joint  “cost”  function  that  uses  stochastic  information  from  Cabrelli’s 
algorithm  and  deterministic  information  from  Rietsch’s  algorithm  (or  other  deterministic  algorithms)  should 
produce  significantly  improved  results  and  be  stable  in  noise. 

6.  CONCLUSIONS 

The  Cabrelli  algorithm  for  blind  deconvolution  has  been  elucidated  and  tested  with  simulated  multipath 
corrupted  transient  data.  It  was  found  to  be  insensitive  to  the  complexity  of  the  Green’s  function  describing 
the  propagation,  relatively  robust  against  noise,  but  very  sensitive  to  signal  type  and  signal  complexity. 
Specifically,  it  was  found  to  be  a  good  preprocessor  for  the  classification  of  pulses  and  sums  of  sines,  but  it 
performed  poorly  with  metallic  bangs,  LFMs,  and  PRNs.  The  probability  of  false  classification  as  a  pulse 
was  unusually  high  since  it  seemed  to  want  to  classify  everything  (except  sums  of  sines)  as  a  pulse.  These 
results  might  not  be  very  reliable  because  an  overly  simplistic  classification  algorithm  was  used  (normalized 
correlation  with  ground  truth).  The  algorithm  should  be  tested  in  conjunction  with  a  classification  algorithm 
more  in  line  with  those  used  in  Fleet  operations.  Finally,  a  brief  description  was  given  of  a  procedure  to 
produce  an  algorithm  that  performs  better  than  this  one. 
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